NSF PAR Search | NSF Public Access Repository

Data Abstraction Elephants: The Initial Diversity of Data Representations and Mental Models

https://doi.org/10.1145/3544548.3580669

Williams, Katy; Bigelow, Alex; Isaacs, Katherine E. (April 2023, Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems)

Two people looking at the same dataset will create different mental models, prioritize different attributes, and connect with different visualizations. We seek to understand the space of data abstractions associated with mental models and how well people communicate their mental models when sketching. Data abstractions have a profound influence on the visualization design, yet it’s unclear how universal they may be when not initially influenced by a representation. We conducted a study about how people create their mental models from a dataset. Rather than presenting tabular data, we presented each participant with one of three datasets in paragraph form, to avoid biasing the data abstraction and mental model. We observed various mental models, data abstractions, and depictions from the same dataset, and how these concepts are influenced by communication and purpose-seeking. Our results have implications for visualization design, especially during the discovery and data collection phase.
Full Text Available
Guidelines For Pursuing and Revealing Data Abstractions

https://doi.org/10.1109/TVCG.2020.3030355

Bigelow, Alex; Williams, Katy; Isaacs, Katherine E. (February 2021, IEEE Transactions on Visualization and Computer Graphics)
Full Text Available
Usability and Performance Improvements in Hatchet

https://doi.org/10.1109/HUSTProtools51951.2020.00013

Brink, Stephanie; Lumsden, Ian; Scully-Allison, Connor; Williams, Katy; Pearce, Olga; Gamblin, Todd; Taufer, Michela; Isaacs, Katherine E.; Bhatele, Abhinav (November 2020, 2020 IEEE/ACM International Workshop on HPC User Support Tools (HUST) and Workshop on Programming and Performance Visualization Tools (ProTools))
Performance analysis is critical for pinpointing bottlenecks in parallel applications. Several profilers exist to instrument parallel programs on HPC systems and gather performance data. Hatchet is an open-source Python library that can read profiling output of several tools, and enables the user to perform a variety of programmatic analyses on hierarchical performance profiles. In this paper, we augment Hatchet to support new features: a query language for representing call path patterns that can be used to filter a calling context tree, visualization support for displaying and interacting with performance profiles, and new operations for performing analyses on multiple datasets. Additionally, we present performance optimizations in Hatchet’s HPCToolkit reader and the unify operation to enable scalable analysis of large datasets.
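As an illustration of filtering a calling context tree by a call path pattern, here is a minimal sketch in plain Python. It does not use Hatchet or its query language; `Node`, `matching_paths`, and the example tree are hypothetical names and data invented for this sketch, and a pattern is assumed to be a non-empty list of predicates matched against consecutive frames of a call path.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class Node:
    """One frame in a toy calling context tree (hypothetical structure)."""
    name: str
    time: float
    children: list[Node] = field(default_factory=list)


Predicate = Callable[[Node], bool]


def matching_paths(root: Node, pattern: list[Predicate]) -> Iterator[list[Node]]:
    """Yield root-to-node paths whose last len(pattern) frames satisfy the pattern."""
    if not pattern:
        raise ValueError("pattern must contain at least one predicate")

    def walk(node: Node, path: list[Node]) -> Iterator[list[Node]]:
        path = path + [node]
        tail = path[-len(pattern):]
        if len(tail) == len(pattern) and all(p(n) for p, n in zip(pattern, tail)):
            yield path
        for child in node.children:
            yield from walk(child, path)

    yield from walk(root, [])


# Made-up profile: main calls solve and io; both eventually call MPI_Allreduce.
tree = Node("main", 10.0, [
    Node("solve", 8.0, [Node("MPI_Allreduce", 5.0), Node("compute", 2.0)]),
    Node("io", 1.0, [Node("MPI_Allreduce", 0.5)]),
])

# Pattern: a frame named "solve" directly followed by any MPI call.
pattern = [lambda n: n.name == "solve", lambda n: n.name.startswith("MPI_")]
hits = [[n.name for n in path] for path in matching_paths(tree, pattern)]
print(hits)  # [['main', 'solve', 'MPI_Allreduce']]
```

Only the MPI call reached through `solve` is kept; the one under `io` is filtered out, which is the kind of selection a call path pattern is meant to express.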
Full Text Available
Visualizing a Moving Target: A Design Study on Task Parallel Programs in the Presence of Evolving Data and Concerns

https://doi.org/10.1109/TVCG.2019.2934285

Williams, Katy; Bigelow, Alex; Isaacs, Katherine E. (August 2019, IEEE Transactions on Visualization and Computer Graphics)

Common pitfalls in visualization projects include lack of data availability and the domain users' needs and focus changing too rapidly for the design process to complete. While it is often prudent to avoid such projects, we argue it can be beneficial to engage them in some cases as the visualization process can help refine data collection, solving a “chicken and egg” problem of having the data and tools to analyze it. We found this to be the case in the domain of task parallel computing where such data and tooling is an open area of research. Despite these hurdles, we conducted a design study. Through a tightly-coupled iterative design process, we built Atria, a multi-view execution graph visualization to support performance analysis. Atria simplifies the initial representation of the execution graph by aggregating nodes as related to their line of code. We deployed Atria on multiple platforms, some requiring design alteration. We describe how we adapted the design study methodology to the “moving target” of both the data and the domain experts' concerns and how this movement kept both the visualization and programming project healthy. We reflect on our process and discuss what factors allow the project to be successful in the presence of changing data and user needs.
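The idea of simplifying an execution graph by aggregating nodes according to their line of code can be sketched as follows. This is not Atria's implementation, and the abstract does not state the exact aggregation rule; the function `aggregate_by_line`, the data layout, and the sample tasks are assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_by_line(nodes, edges):
    """Collapse task nodes that share a source location into one group.

    nodes: dict mapping task id -> (file, line)   (hypothetical layout)
    edges: iterable of (src_task_id, dst_task_id)
    Returns (groups, aggregated_edges), where edges inside a group are dropped.
    """
    groups = defaultdict(list)
    for task_id, location in nodes.items():
        groups[location].append(task_id)

    aggregated_edges = set()
    for src, dst in edges:
        a, b = nodes[src], nodes[dst]
        if a != b:  # keep only edges that cross between source locations
            aggregated_edges.add((a, b))
    return dict(groups), aggregated_edges


# Made-up execution graph: four tasks spawned from two lines of one file.
nodes = {1: ("fib.py", 10), 2: ("fib.py", 10), 3: ("fib.py", 12), 4: ("fib.py", 12)}
edges = [(1, 3), (2, 4), (1, 2)]

groups, agg_edges = aggregate_by_line(nodes, edges)
print(groups)
print(agg_edges)
```

Four task nodes and three edges reduce to two grouped nodes and one edge, which gives a smaller initial view that can be expanded on demand.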
Full Text Available
JetLag: An Interactive, Asynchronous Array Computing Environment

https://doi.org/10.1145/3311790.3396657

Brandt, Steven R.; Bigelow, Alex; Sakin, Sayef Azad; Williams, Katy; Isaacs, Katherine E.; Huck, Kevin; Tohid, Rod; Wagle, Bibek; Shirzad, Shahrzad; Kaiser, Hartmut (July 2020, Practice and Experience in Advanced Research Computing (PEARC '20))
We describe an interactive computing environment called JetLag. JetLag implements the following features of the Phylanx project: (1) Phylanx, a Python-based asynchronous array computing toolkit; (2) the APEX performance measurement library; (3) a performance visualization framework called Traveler; (4) the Tapis/Agave Science as a Service middleware; and (5) a container infrastructure that includes a Docker-based Jupyter notebook for the client and a Singularity image for the server. The running system starts with a user performing array computations on their workstation or laptop. If, at some point, the calculation the user is performing becomes sufficiently intensive or numerous, it can be packaged and sent to another machine where it will run (through the batch queue system if there is one), produce a result, and have that result sent back to the user’s local interface. Whether the calculation is local or remote, the user will be able to use APEX and Traveler to diagnose and fix performance-related problems. The JetLag system is suitable for a variety of array computational tasks, including machine learning and exploratory data analysis.
Full Text Available
Asynchronous Execution of Python Code on Task-Based Runtime Systems

https://doi.org/10.1109/ESPM2.2018.00009

Tohid, R.; Wagle, Bibek; Shirzad, Shahrzad; Diehl, Patrick; Serio, Adrian; Kheirkhahan, Alireza; Amini, Parsa; Williams, Katy; Isaacs, Kate; Huck, Kevin; et al. (November 2018, 2018 IEEE/ACM 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2))

Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and artificial intelligence (AI), from utilizing the performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenience of programming in low-level languages and the cost of acquiring the skills required for programming at this level. In recent years, Python, with the support of linear algebra libraries like NumPy, has gained popularity despite limitations that prevent such code from running in a distributed setting. Here we present a solution that maintains both high-level programming abstractions and parallel and distributed efficiency. Phylanx is an asynchronous array processing toolkit that transforms Python and NumPy operations into code that can be executed in parallel on HPC resources by mapping Python and NumPy functions and variables into a dependency tree executed by HPX, a general-purpose, parallel, task-based runtime system written in C++. Phylanx additionally provides introspection and visualization capabilities for debugging and performance analysis. We have tested the foundations of our approach by comparing our implementation of widely used machine learning algorithms to accepted NumPy standards.
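To make the notion of a dependency tree executed by a task-based runtime concrete, here is a minimal sketch using only the Python standard library. It is not Phylanx or HPX: a thread pool stands in for the runtime, nested tuples stand in for the tree, and `OPS`, `schedule`, and `run` are hypothetical names introduced for this example.

```python
import operator
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical operation table; a real toolkit would map array primitives here.
OPS = {"add": operator.add, "mul": operator.mul, "neg": operator.neg}


def schedule(node, pool: ThreadPoolExecutor) -> Future:
    """Submit one task per tree node and return a Future for its value.

    Children are submitted before their parent, so a parent task only ever
    waits on work that was queued earlier and the pool cannot deadlock.
    """
    if not isinstance(node, tuple):
        return pool.submit(lambda: node)  # leaf: a plain value
    op, *args = node
    child_futures = [schedule(arg, pool) for arg in args]
    return pool.submit(lambda: OPS[op](*(f.result() for f in child_futures)))


def run(tree, workers: int = 4):
    """Evaluate the whole tree; independent subtrees may run concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return schedule(tree, pool).result()


# (2 * 3) + (-4): the "mul" and "neg" subtrees do not depend on each other.
tree = ("add", ("mul", 2, 3), ("neg", 4))
result = run(tree)
print(result)  # 2
```

The parent `add` task starts only after both child futures resolve, which mirrors how data dependencies, rather than program order, determine when each node of the tree runs.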
Full Text Available
